Robust control chart for nonlinear conditionally heteroscedastic time series based on Huber support vector regression

This study proposes a control chart that monitors conditionally heteroscedastic time series by integrating the Huber support vector regression (HSVR) and the one-class classification (OCC) method. For this task, we consider the model that incorporates nonlinearity to the generalized autoregressive conditionally heteroscedastic (GARCH) time series, named HSVR-GARCH, to robustly estimate the conditional volatility when the structure of time series is not specified with parameters. Using the squared residuals, we construct the OCC-based control chart that does not require any posterior modifications of residuals unlike previous studies. Monte Carlo simulations reveal that deploying squared residuals from the HSVR-GARCH model to control charts can be immensely beneficial when the underlying model becomes more complicated and contaminated with noises. Moreover, a real data analysis with the Nasdaq composite index and Korea Composite Stock Price Index (KOSPI) datasets further disclose the validity of using the bootstrap method in constructing control charts.


Introduction
In this study, we develop a statistical process control (SPC) method for a nonlinear conditionally autoregressive heteroscedastic (GARCH) time series whose underlying structure is unspecified, possibly possessing noises.SPC has been developed during the past decades to improve the quality of products through the monitor and control of manufacturing processes [1].Generally, control charts are appreciated as predominant devices that utilize graphical descriptions of the characteristics of a process, which is designed to promptly signal the outof-control state of processes.The conventional charts are originally built for independent or uncorrelated observations, but these do not work adequately when they are correlated over time as the correlations among observations cause a high rate of false alarms, see [2][3][4][5][6][7].As a remedy, control charts that monitor dependent observations have been proposed in the literature, see [8][9][10][11], and further refer to [12] for more recent development.
In practice, however, constructing control charts to monitor GARCH-type time series can be quite challenging, especially when financial time series is targeted.This is so because in financial time series, the task of finding an elongated period of in-control state observations is not feasible, as the time series occasionally suffers from structural instabilities attributed to external socioeconomic factors.This implies that financial time series can become nonstationary if observed for an extended amount of time, and it is difficult to single out a specific model due to the unpredictability of such economic affairs.Although variants of parametric GARCH models were coined to estimate time-varying volatilities, they are commonly apt to violate the stationarity assumptions, needed for precisely estimating model parameters [13].In practice, accurately calculating residuals based on the presumed models is critical as they are directly harnessed in the construction of the control charts, and therefore, the misspecification of underlying models can be a serious issue in designing control charts.To alleviate this problem, practitioners have developed nonparametric machine learning-based control charts.For an overview, we refer to [14,15].Among those, we pay special attention to one-class classification (OCC)-based control charts, which are designed to identify the boundary of a specific class among all data points via learning from a training dataset.For relevant works, we refer to [16][17][18].
To design a flexible model that meticulously captures the structure of the conditional volatility and avoid the model assumptions, we root our method to the support vector regression (SVR), a variant of the support vector machine (SVM), revised for regression.As SVR can afford to approximate the nonlinearity between variables without knowing their underlying dynamic structure a priori, while concurrently implementing the structural risk minimization principle [19] that balances between the model complexity and the empirical risk [20], SVR has been accepted as a promising tool to estimate the volatility of time series.For a general overview of SVR, we refer to [21][22][23], and further refer to [24][25][26] for incorporating SVR to estimate time series models.Specifically, [27,28] used the extended GARCH model, named SVR-GARCH, to obtain residuals and construct both retrospective change point tests and monitoring methods.
However, as GARCH-type time series inherently suffers from instability as observations can become wildly explosive, monitoring methods that accommodate variants of parametricor SVR-GARCH models are often inefficient for constructing a reliable test.For instance, [29] fitted an AR(p) model to the obtained residuals and truncated excessively large residuals to prevent them from undermining the Type I error rate of the test.This phenomenon arises primarily because standard SVRs still can be susceptible to the outliers lying in the training dataset [30], which generally makes the residuals behave in a more correlated manner, while also rendering some of them to become excessively larger than the others.This deficiency can be severely detrimental against the accuracy of monitoring as it requires "well-behaved" in-control observations or residuals as a primary ingredient of time series monitoring.
To circumvent the problem, we integrate the robust variant of SVR to estimate nonlinear GARCH-type time series models.Among the variants of robust SVR in the literature, such as [31,32], we specifically employ the Huber SVR (HSVR) by [33] when estimating nonlinear GARCH models.HSVR not only allows to design more flexible models by providing methods to incorporate asymmetry, but also it is suitable for efficiently fitting GARCH models as it suppresses the effects of the explosive observations.Therefore, we also adopt the nonlinear GARCH model using HSVR, called the HSVR-GARCH model, to accommodate a broad class of nonlinear GARCH models.
Moreover, to facilitate the control chart to effectively reflect the sophisticated structure of time series, we propose a control chart that harnesses a nonparametric data description method.In particular, we utilize the support vector data description (SVDD) method, a version of SVM reconstructed to solve the OCC problems [34,35].SVDD provides a hypersphere boundary on the data points, aimed to contain the desired proportion of observations within that hypersphere of a minimum radius, which can detect the data point out of the hypersphere boundary as an anomaly.This approach has been taken by several authors and proved to be useful in practice [16,[36][37][38][39].More recently, [40] considered a control chart based on the variant of one-class SVM that well detects mean shifts of a given process.Herein, we utilize squared residuals to train SVDD and determine the decision boundary of in-control time series as a control limit.
The rest of this paper is organized as follows.Section 2 summarizes two variants of SVR, namely, HSVR and SVDD.Section 3.1 elaborates the specification and the process of fitting HSVR-GARCH model alongside with its advantages, while Section 3.2 enumerates the procedure to train the OCC-based control chart using SVDD.Section 4 conducts Monte Carlo simulations to evaluate the performance of control charts that utilize HSVR-GARCH residuals in various circumstances, including the case that observations contain innovational or additive outliers.Our simulation experiments also evaluate the performance of control charts with HSVR-GARCH residuals when they are trained with the samples generated through a bootstrap method.Section 5 performs a real data analysis using the Nasdaq composite index and the Korean stock price index (KOSPI) to further solidify the validity of using bootstrap samples obtained from HSVR-GARCH models.Finally, Section 6 provides the concluding remarks.

Machine learning methods
This section introduces two machine learning methods, namely, the HSVR-GARCH model and the support vector data description (SVDD) method that comprise the control chart for monitoring nonlinear GARCH-type time series.We first provide a concise overview of these learning methods and then elaborate the procedure to apply those in constructing our monitoring schemes in the subsequent sections.

Huber SVR
Support vector regression (SVR) is a statistical learning method that originated from [41], which aims to precisely estimate the structure of a given dataset nonparametrically.Unlike many traditional parametric time series models, the strength of estimating time series model using SVR arises from its innate versatility to model dataset whose underlying structure is unknown or contains a severe degree of nonlinearity.This strength is particularly beneficial especially when estimating the structure of time series with heteroscedastic volatility, such as financial indices, as conditional volatility is unobservable in general and can be complex due to the interdependent nature of the current global economy.Moreover, utilizing SVR to estimate the structure of the time series can help to simultaneously avoid deploying a sophisticated parametric model, such as the family of exponential GARCH (EGARCH) models [42].
In this study, we utilize the Huber SVR (HSVR) when modeling nonlinear GARCH models to escalate the robustness.HSVR is advantageous over the standard SVR as it can effectively offset the effects of sporadic outliers, frequently observed in time series.Although many versions regarding HSVR exist in the literature, we utilize the �-insensitive HSVR proposed by [33,43].
Given a set of training observations {(x i , y i )} (i = 1, . .., n, x i 2 R m ; y i 2 R), HSVR seeks to find a function of the form: where x 2 R m , A 2 R n�m is a matrix whose j-th row consists of x T j (j = 1, . .., n), w 2 R n and b 2 R are parameters to be estimated, and , ϕ 1 (x 2 )i for some predefined kernel K. Here, h�, �i and ϕ 1 (x) respectively denote the inner product of the feature space and the implicit kernel operator corresponding to the kernel K 1 .
At first glance, (1) seems to deviate from the function form of the standard SVR, which is given as, where K and ϕ is defined analogously to K 1 and ϕ 1 , respectively.However, we can see that ( 1) and ( 2) are actually identical, which implies that HSVR can be viewed as a direct extension of the standard SVR.This is because by using the definition of K 1 (x, A T ), (1) can also be rewritten as where w j is the j-th element of the vector w.Therefore, analogous to SVR, estimating (1) using HSVR can be understood as implicitly performing the following two-stage procedure: (i) first map the given observations (x 1 , � � �, x n ) onto some feature space using the prescribed mapping function ϕ 1 to produce (z 1 , � � �, z n ) ≔ (ϕ(x 1 ), � � �, ϕ(x n )), (ii) then estimate the linear function on that feature space that best describes the mapped observations z i .For more details, we refer to [20,28] for the standard SVR and [33] for HSVR.
To estimate the model parameters w and b in (1), we solve where C 1 is a tuning parameter that balances between the model risk and the flatness of the estimated function, and L is an �-insensitive Huber loss function defined as follows: where x + ≔ max(x, 0) for x 2 R, and �, γ � 0 are tuning parameters that determine the degree of robustness, see Fig  As depicted in Fig 1, �-insensitive Huber loss function returns zero when x 2 [−�, �], increases quadratically in (−γ, −�] and (�, γ], then increases linearly otherwise.Moreover, in a statistical perspective, the optimization problem (3) attains the shrinkage estimator regarding w and b rather than only w, as seen in the traditional SVR models.This alteration is employed to derive a unique solution of w and b in solving (3) while simultaneously maintaining the strong convexity, refer to [44].
Despite the similarities between HSVR and SVR, the former deviates from the latter in terms of the way of solving the given optimization problem.Indeed, HSVR directly solves the primal problem by taking an iterative approach.To attain the solution, we first explicitly expand (3) by substituting (4), then reparametrize the problem with respect to z ≔ (w T , b) T : , A is a matrix of explanatory variables, whose i-th row is defined as x T i , and 1 n 2 R n is a vector of ones.As the objective function is strongly convex with respect to z, we attain the optimality by obtaining the solution of the first-order condition, that is, @LðzÞ @z ¼ 0: Therefore, (5) reduces to the problem of the following iterative method that computes the updated solution z i+1 from the previous iteration z i : for i = 0, 1, . ... The general process of using the aforementioned HSVR for GARCH models is presented in Section 3.

Support vector data description algorithm
To formulate the control chart that integrates the one-class classification algorithm, we use the support vector data description (SVDD) model in this study, which is a complex of support vector machine (SVM) and the data description algorithm proposed by [35,45].SVDD is a method to find a boundary that encompasses given observations with a single hypersphere in the feature space.Because the proportion of observations inside the hypersphere can be settled in advance, we can readily employ SVDD as an anomaly detection method.For a general overview, we refer to [36].
The fundamental concept of SVDD stems from constructing a hypersphere with a minimum possible volume that includes the desired number of observations within such a boundary.Namely, given the training dataset x i 2 R p (i = 1, . .., n), SVDD solves the following optimization problem with respect to the center a 2 R p and the radius R � 0 of the hypersphere: where ξ i � 0 (i = 1, � � �, n) denotes a slack variable that penalizes observations lying outside the hypersphere, C 2 � 0 is a tuning parameter which controls the balance between the misclassification errors and the volume of the hypersphere, and ϕ 2 (x) is an implicit kernel operator that is defined analogously to the kernel function K 1 in Section 2.1, namely, In practice, the following Gaussian kernel function is widely selected: where κ > 0 is a tuning parameter that determines the complexity of the decision boundary.
In practice, the decision boundary becomes more nonlinear and sophisticated when κ is small.The general procedure of solving ( 6) is similar to that of the classical SVM.To elaborate, we first construct the unconstrained problem as below: where α i � 0 and γ i � 0 (i = 1, . .., n) are Lagrange multipliers.Then, by using the Karush-Khun-Tucker conditions, we obtain the following equivalent dual problem with respect to α i : Notice that γ i vanishes due to the relationship α i = C 2 − γ i , (i = 1, . .., n), which is obtainable using the first-order conditions.Using the optimal α i of the problem (9), the radius R 2 can be calculated by the formula: where x sv denotes one of the observations that is precisely located on the boundary of the constructed hypersphere.Similarly, we measure the kernel distance between a new observation z 2 R p and the center a by computing and z is classified as an anomaly when D 2 is larger than R 2 .A primary reason why SVDD can be deployed in constructing a control chart is the usage of kernels, which enables practitioners to design a tailored control chart suitable to their needs.Indeed, the control limit of the chart can be determined by the tuning parameter C 2 in (6), as it can be exploited to precisely specify the rate of the misclassified datasets.To illustrate, by adjusting C 2 , we can reset the upper bound ρ > 0 of the misclassification error rate through the following relationship: and it is known that ρ matches the misclassification rate when the number of observations are sufficiently large and the distribution of the dataset is continuous [45].Control charts using SVDD can be remarkably favorable when the underlying structure of the dataset is sophisticated, as SVDD seeks the boundary from a higher-dimensional feature space that can be further adjusted using κ in (7).Therefore, tuning C 2 is analogous to adjusting the control limit of control charts, see Sections 3 and 4 for more details.

OCC-based monitoring scheme
In this section, we summarize the general procedure of constructing the OCC-based control chart.The procedure is divided into two parts: first fitting the HSVR-GARCH model to obtain residuals, then deploying SVDD to determine the control limit.

HSVR-GARCH model
We extend the GARCH model to capture the nonlinear structure of the conditional volatility specified as follows: where f is an unknown nonlinear function with the form in (1), which would be obtained by solving (3), s 2 t is the conditional volatility at time t = 1, . .., n, and � t are independent and identically distributed (iid) errors with E(� 1 ) = 0 and Var(� 1 ) = 1.Observe that, as s 2 t is a (conditional) variance, both s 2 t and σ t must be nonnegative by definition.Model (12) includes a broad class of stationary GARCH time series models.For a class of parametric GARCH models, [46] studied the estimation of their parameters and presented a method of conducting the change point test as an application.Their research later offered a theoretical foundation that based the OCC control charts of [47].
Although it seems plausible to directly deploy HSVR to model obtain the function f in (12), it is actually impractical because HSVR does not automatically guarantee the conditional variance s 2 t to be nonnegative.To rectify the problem, we modify the above model by taking the logarithm to ensure the nonnegativity of s 2 t as follows: As s 2 t is not observable in most practical circumstances when estimating g in ( 13), we substitute s 2 t in (12) with for t = 1, . .., n, and s � 1 denotes a degree of smoothness.When t < s, ( 14) is defined as t is one of the most widely accepted proxy of s 2 t and can be readily deployed when predicting the conditional volatility [48] and constructing monitoring schemes [28].In this study, we set s = 20, as it renders the ŝ2 t to be stabilized around the level of the unconditional variance, thereby enhancing the quality of residuals utilized in the formulation of control charts.
Another repercussion due to the unknown s 2 t is that an adequate source, purposed to compare with ŝ2 t for computing the empirical loss, is absent when aiming to determine the optimal set of tuning parameters, as ŷt is always estimated to be zero for pure GARCH-type models.This definitely prohibits practitioners from using standard loss functions, such as the mean absolute error (MAE), when performing the tuning parameter optimization.We overcome this problem by introducing a likelihood-based loss ℓ(g) for a function g with a regularization term regarding w.Specifically, by temporarily assuming that the error term � t is normally distributed, we consider the negative log-likelihood of g in ( 13) as follows: Then, by appending the regularization term regarding w to prevent overfitting, we obtain the following loss function [49]: where δ � 0 is a tuning parameter that balances between the model's flatness and the goodness of fit.This approach is conceptually similar to that of the quasi maximum-likelihood estimator (QMLE), which is the standard method of estimation in parametric GARCH models [13].
The general procedure to fitting a HSVR-GARCH model is based on the sequential k-fold cross-validation method for time series.To elaborate, we follow the steps below: Step 1. Partition the time series (x t , y t ) into chunks of k � 2, and denote the time series of the m-th chunk as ðx ðmÞ t ; y ðmÞ t Þ (m = 1, � � �, k).Moreover, let n m be the number of observations in the chunk m.
Step 3. Using the m-th chunk, compute the loss in (15) for the m-th chunk and denote the result as Step 4. Repeat Steps 2 and 3 for m = 2, � � �, k.
Step 5.The loss function that determines the optimal tuning parameter is finally defined as and the best set of tuning parameters is then set to those that minimize ' * ðĝ Þ.

This process of obtaining ŝ2
t via HSVR-GARCH models is encapsulated in Algorithm 1 below.
The set of tuning parameters for HSVR-GARCH model consist of four elements: (i) the regularization parameter C in (3), (ii) two parameters that comprise the �-insensitive Huber-loss function � and γ, (iii) the kernel tuning parameter s 2 for the Gaussian kernel in Section 2.1 and (iv) the regularization parameter δ for the likelihood-based loss (15).In this study, we initially fix δ, then seek for the remaining tuning parameters that minimize Lðĝ Þ via the particle swarm optimization.The specifications regarding the tuning parameters are presented in Sections 4 and 5 below.
One of the prominent advantages of utilizing the HSVR-GARCH model over the SVR-GARCH model of [27] is that it significantly stabilizes the estimated volatility, thus liberates from the necessity of any posterior treatment of residuals when constructing the control chart.Due to the unpredictable nature of GARCH models, conditional volatility is prone to have abrupt anomalies, even when no structural breaks exist.As such, one of the innate drawbacks of the SVR-GARCH model is that it occasionally returns incorrectly estimated conditional variance, as witnessed in Fig 2, which results in the underestimation of s 2 t in regions where the time series is relatively stable.This may prevent practitioners from directly utilizing the residuals based on the SVR-GARCH model because some residuals are computed to be explosively large.This phenomenon might have occurred as the SVR-GARCH method uses the mean absolute error-type loss as shown below that compares the estimated ŝ2 t with the proxy s2 t in ( 14) in optimizing tuning parameters: where k and m respectively denote the length of the training and the total length of the time series.In fact, two aforementioned figures concurrently depict that ŝ2 t estimated from the SVR-GARCH model does not heavily depart from s2 t , and fails to resemble the true s 2 t in some occasions.On the other hand, by employing the likelihood-based loss in (15) with an optimal set of tuning parameters, the HSVR-GARCH method can escape from comparing ŝ2 t against the proxy variables which were already deployed in obtaining ĝ .This alteration substantially stabilizes the estimation process producing ŝ2 t , which more closely resembles the true s 2 t , as reflected in Fig 2 , and thereby, results in producing well-behaved residuals.

Constructing OCC-based control chart
In this study, OCC-based control chart refers to the control chart that exploits the properties of ρ 2 (0, 1) in (11) mentioned in Section 2.2 to prescribe a desired level of the control limit.Given model residuals �t ¼ y t =ŝ t and a set of tuning parameters (ρ, κ), where ŝt is estimated by using any suitable estimation method, we fit SVDD with the Gaussian kernel presented in (16) to the squared residuals �2 t obtained from the training time series, and regard R 2 as the control limit.Moreover, when a new time series y n+k (k � 1) is observed, we obtain its squared residuals �2 nþk and then measure the distance Dð� 2 nþk Þ 2 defined in (10) against the origin of hypersphere for each k, and declare the process out-of-control if it exceeds R 2 .
The control limit is determined by leveraging two tuning parameters, namely, the inclusion rate ρ and the kernel tuning parameter κ, which settle the in-control average run length (ARL) to a desired level.To find the optimal tuning parameters, we use a method that hybridizes the standard grid searching method and quasi-Newton methods, such as the limited-memory BFGS algorithm [50].This is because solely relying on the latter method or other first-order optimization methods may render tuning parameters not to converge when the provided length of the training time series is insufficient.
Although the general procedure to construct the OCC-based control chart in this study resembles those of [47], our OCC-based control chart is designed to preserve the original structure of the given time series compared to the latter.In implementation, OCC charts of [47] must sequentially perform the Yeo-Johnson transformation and the moving-average smoothing to the residuals to alleviate the instability caused by the inaccurate estimation of the model parameters.Despite these efforts, OCC charts had a relatively inferior performance in terms of ARL 1 compared to other charts, such as CUSUM or EWMA charts, when monitoring the conditional volatility using squared residuals, see Section 4 of [47].Furthermore, as our simulation study indicates, smoothing residuals undermines the overall detection ability when the underlying model is contaminated with noises, see Section 4 of this paper for more detail.We actually suspect that such posterior manipulations on the residuals ignited this phenomenon by contaminating the information that the time series originally contained.However, as our OCC-based chart directly utilizes the squared residuals without any alterations, it not only facilitates the model residuals to retain the original structure, but also enhances the detection ability, as witnessed in the simulation results in Section 4.
The gist of constructing the OCC-based control chart, given a time series and some adequate models, is summarized in Algorithms 2 and 3 below.Specifically, Algorithm 2 describes the procedure of implementing SVDD for building an OCC-based control chart, given two tuning parameters ρ and κ.Meanwhile, Algorithm 3 provides an instruction to formulate OCC-based control charts using SVDD.
Remark 1 Under the circumstance in which obtaining copies of in-control time series is infeasible, we can empirically optimize tuning parameters by performing a wild-bootstrap method as specified below:   �t;j z t;j =ŝ t;j 14:

Simulation results
This section assesses the performance of control charts that utilizes HSVR residuals in various nonlinear GARCH models.The first subsection describes the settings of the experiment in more detail, and the subsequent subsection reports the performance measured in terms of the average run length (ARL).

Specifications of the experiment
For the experiment, we consider the cases where the underlying models are GJR-GARCH(p, q) [51] and log-GARCH(p, q) specified below: where ω � 0 and {� t } is an iid random process.Here, the log-GARCH(1,1) model is a variation of the exponential GARCH model (i.e., EGARCH(p, q); see [42,52]), which has a high degree of nonlinearity.In this experiment, we set p = q = 1 and set (ω, α 1 , α 2 , β) = (0.3, 0.1, 0.5, 0.3) for the GJR-GARCH model, and (ω, α, β) = (0.3, 0.3, 0.3) for the log-GARCH model.Moreover, to further reflect the circumstance where the underlying model is highly volatile and unstable, we additionally consider the cases where the initial distribution of � t in the training time series is specified as follows: • Case 1. � t * N(0, 1), the standard normal distribution; • Case 2. � t * t(5), t-distribution with 5 degrees of freedom; The latter three distributions symbolize the scenarios in which the observed time series is either inherently heavy-tailed or occasionally unstable.In particular, the latter two distributions are variants of a normal mixture distribution that represents the underlying model being highly volatile due to the innovational outliers.Specifically, Z k and Z l, m respectively denote that the underlying structure of time series is systematically unstable and asymmetric, which are some traits that can be witnessed in various financial time series.In this study, we consider the case of k = 3 and (l, m) = (1, 2).
In addition to considering innovational outliers, we also contaminate the observed time series with additive outliers in some experiments.Namely, we consider the case where we observe y 0 t ¼ y t þ Z t rather than directly observing y t , where η t follows some prescribed distribution function.For this experiment, we respectively set η t analogous to Cases 1 and 4 for GJR-GARCH model, and to Cases 3 and 4 for log-GARCH model when additive outliers are taken into consideration in the model.
When comparing the performance, we evaluate ARLs of three control charts, namely, the proposed OCC-based control chart in Section 3, CUSUM chart, and EWMA chart, that uses squared residuals.Here, CUSUM chart refers to the control chart that utilizes where K = kσ, H = hσ, σ denotes the standard deviation of squared residuals �2 t obtained from the training time series, and μ denotes the mean of �2 t .On the other hand, EWMA chart denotes the control chart that uses with the upper and lower control limits computed as where λ 2 (0, 1) is a smoothing parameter, and Z 0 is defined to be the sample mean of �2 t .Notice that the control limit of CUSUM and EWMA charts are respectively controlled by tuning k � 0, h � 0, or L e � 0, to dictate control charts to achieve the desired level of ARL 0 .For an overview of these control charts, we refer to [53][54][55].
Moreover, as we regard the underlying structure of the time series is unknown, we compare the detection ability of control charts using HSVR-GARCH against those that utilizes the standard GARCH(1,1) model specified below: where ω, α, and β are all nonnegative, and α + β < 1 is fulfilled to ensure the stationarity.Notice that fitting GARCH(1,1) models to some other conditionally heteroscedastic time series is customarily accepted in the presence the model uncertainty, see [56].
The general procedure of the experiment is summarized as follows.We first generate a time series of length n for training either the HSVR-GARCH model stated in Section 3 or the standard GARCH(1,1) model to obtain the residuals.We subsequently use squared residuals �2 t to formulate the control chart, as they are solely designed to target the change of variance.We then sequentially generate 1,000 independent streams of time series without any structural changes in order to find adequate tuning parameters that fix ARL 0 to the desired level, for example, the frequently used 370, though this number must be reduced significantly in a practical situation, as mentioned later in the real data analysis.To evaluate the detection ability, we generate another 1,000 independent streams of time series that includes a structural change, then apply the control charts to obtain ARL 1 .
We set the length of the training time series as n = 2, 000 and examine ARL 1 of control charts when some parameters, innovational distribution of � t , or the magnitude of the additive noise experience a change, respectively.Particularly, we specify the way in which the structural change occurs in Tables 1-8 below, alongside with the respective tuning parameters for each used control charts.When fitting HSVR-GARCH to time series, we initially fix δ = 0.1 and the number of chunks k to 5, then obtain other tuning parameters by employing the particle swarm optimization method.

Experiment results
In this subsection, we report the results of three experiments as follows.First, we compare the performance of our proposed OCC-based control chart with that of [47] by comparing control charts constructed from either �2 t or �2 t , where the latter denotes the residuals employed in [47].For fairness, we consider the control charts constructed from HSVR-GARCH residuals for both models to avoid the model misspecification issue in this particular experiment.Finally, we present the performance of control charts constructed from wild-bootstrap samples introduced in Remark 1 of Section 3.2.For the latter experiment, we set the number of bootstrap samples as B = 1000.
Table 1 report the comparison of OCC, CUSUM, and EWMA control charts that utilize the squared residuals �2 t against those that uses �2 t , where the underlying model is GJR-GARCH (1,1) with only innovational outliers of Z 1,2 being present.Note that for OCC-based control charts, directly using �2 t yields significantly enhanced results compared to those using �2 t .Moreover, in most circumstances using �2 t , a small improvement is observed regarding the detection ability in the latter two control charts.We speculate that this phenomenon occurred because �2 t diluted the information contained in the residuals, therefore undermining the overall detection ability.Indeed, as HSVR-GARCH model possesses the robustness on the estimated conditional volatility, squared residuals also become stabilized, thereby allowing to circumvent the constraint of using heuristic methods that struggle to suppress numerical instabilities.The result also implies that our proposed OCC-based control chart is superior to that of [47] in terms of the ARL 1 performance.
Tables 2-4 depict the ARLs of control charts when the underlying model is GJR-GARCH model, where the experiment of the latter two tables particularly contains additive outliers of Cases 1 and 4, respectively.The overall performance of charts with HSVR-GARCH residuals is relatively superior especially when the dataset is contaminated with either innovational outliers or additive noises, although the detection ability of HSVR-GARCH-based control charts somewhat recedes when neither of them are present.Specifically, ARL 1 of CUSUM and EWMA

Table 1. A comparison of control charts using �2
t and �2 t when the underlying model is GJR-GARCH(1,1) with the specified parameters without any additive outliers.Results of the first two columns directly compares our OCC-based control chart with that of [47].charts with HSVR-GARCH residuals consistently detect a change which is 5-10 percent faster than counterparts.In particular, control charts with HSVR-GARCH residuals commonly exhibit stellar detection ability over their GARCH(1,1) residual counterparts when α j (j = 1, 2) experience a change.This finding validates HSVR-GARCH models to be highly effective in capturing a structural change when the time series is systematically more convoluted.Furthermore, unlike [47], our OCC-based chart is shown to compare well with the CUSUM and EWMA charts that use identical squared residuals in most cases and to benefit the most from using HSVR-GARCH residuals among the three control charts.Most notably, all results regarding GJR-GARCH model excluding the case of no outliers indicate the loss of the detection ability of the OCC-based control chart when GARCH(1,1) residuals are used.This phenomenon is most likely resulting from the ill-behaved residuals due to the model bias.Therefore, it is practically advised to jointly deploy the OCC-based control charts with the HSVR-GARCH model to avoid this model bias problem.
Tables 5-7 portray the ARL 0 and ARL 1 of control charts when time series follows the log-GARCH model.Note that the latter two tables specifically depict the dataset contaminated with additive outliers of Cases 3 and 4, respectively.Here, HSVR-GARCH model-based control charts are observed to be more competent, as they detect a structural change up to twice as fast especially when only innovational outliers are present.Indeed, the effect of model misspecification becomes more apparent as control charts with the parametric GARCH(1,1) residuals lose detection ability in some circumstances like the case of α changing from 0.3 to 0.5.Although HSVR-GARCH residuals-based charts do not outperform when the innovations structurally become heterogeneous, HSVR-GARCH residuals still continue to be a superior choice for OCC-based control charts as those residuals significantly shorten ARL 1 compared to the other two charts, while still providing a stable chart.
In addition, unlike the results of GJR-GARCH models, control charts in log-GARCH models without any external additive outliers is shown to be much more stable and immensely powerful, especially in the cases when ω or the innovational distribution of η t experience a change.In particular, when either α or β increases to make a time series to be nearly nonstationary, most control charts with GARCH(1,1) residuals are observed to respond relatively poorly or even fail to respond.In contrast, control charts with HSVR-GARCH residuals not only successfully capture a change under those circumstances, and, on average, detect them twice as fast in the case of the parameter changes.In a nutshell, these results all fortify the strength of constructing control charts with HSVR-GARCH residuals when the underlying model is nonlinear and possibly unknown.
Finally, Table 8 denotes the performance of control charts when they are trained with samples generated by the wild-bootstrap method for both HSVR-GARCH and GARCH(1,1) models, when no outliers exist.Although some degree of performance drop is noticeable in certain cases, the three control charts with HSVR residuals retained the strong detection ability across all settings.Bootstrap methods applied to HSVR-GARCH models can be advantageous particularly when ample samples of in-control time series are unavailable.Note, however, that the bootstrap method can be less effective, for example, when the observed time series is impaired with innovational outliers, as the bootstrap samples are generated from iid standard normal innovations.Therefore, in practice, bootstrap methods can be especially beneficial in realworld scenarios when the source of observations do not possess any serious structural noises.

Real data analysis
This section demonstrates the real-world performance of the control charts that utilizes HSVR-GARCH residuals by analyzing financial indices, namely, the log-returns of the Nasdaq composite index and Korea Composite Stock Price Index We regard the indices from June 2, 2014 to October 31, 2017 (863 observations) for the former index and from March 2, 2013 to October 31, 2019 (1,512 observations) for the latter as the pre-observed training time series, and regard the subsequent time series to be monitored afterwards.For simplicity, we denote the respective time series "Nasdaq" and "KOSPI".One of the prominent characteristics of both datasets is that their training set contains a number of abrupt fluctuations, thus is intrinsically unstable.This behavior can be also examined through the indices illustrated in Figs 3 and 4 and Table 9, where the skewness and the excess kurtosis depart from those of a standard normal distribution.Meanwhile, Fig 5 illustrates that both datasets do not contain significant autocorrelations up to lag 12, which validates fitting a pure GARCH-type time series to these datasets.Despite the instability, however, the estimated conditional volatility of HSVR-GARCH models in  The gist of the procedure regarding the analysis is as follows.We fit HSVR-GARCH model to the given dataset to obtain ĝ that estimates the conditional volatility, then compute 1,000 independent copies of bootstrap samples using ĝ as presented in Remark 1 of Section 3.2.Afterwards, we sequentially observe the indices by simultaneously using OCC, CUSUM, and EWMA control charts with HSVR-GARCH residuals to monitor a structural change.The decision boundary of the OCC-based control chart and the tuning parameters necessary for CUSUM and EWMA charts are all computed using bootstrap samples and optimized to have the ARL 0 of 200, rather than 370, because financial time series is highly infeasible to maintain its structural integrity for a long period of time due to various external or international socioeconomic affairs.This implies that it is generally ill-advised to set a large value of ARL 0 when constructing control charts for real-world financial time series.Therefore, practitioners are suggested to construct plural control charts, avoiding a possible biased result when monitoring financial time series.All unmentioned but remaining settings required for constructing the control charts, such as δ in (15), are identical to those of Section 4. Table 9, alongside with Figs 7 and 8 report the detected location of the change for Nasdaq and KOSPI.For Nasdaq, the optimal set of tuning parameters for OCC, CUSUM, and EWMA charts are respectively computed as (ρ, κ) = (0.9995, 0.05), (k, h) = (0.5, 5.3), and (λ, L e ) = (0.2, 3.4).Moreover, the detected location of a change for all three charts appear to be similar,  The period of the potential structural break is around the commencement of the global trade dispute between the United States and China, which started in late January, 2018.This incident can be considered as a decisive factor triggering the abrupt upsurge of the conditional volatility, as both nations continuously imposed retaliatory tariffs over the following months.In particular, the OCC-based control charts that signaled the change three days before the other two charts illuminate that OCC-based control charts, combined with HSVR-GARCH residuals, can be functional in circumstances where promptly notifying a change is critical.
On the contrary, for KOSPI, the optimal set of tuning parameters for OCC, CUSUM, and EWMA charts are obtained to be (ρ, κ) = (0.9995, 0.025), (k, h) = (0.5, 5.0), and (λ, L e ) = (0.2, 3.1), respectively.Also, analogous to the preceding analysis, all three control charts detected a volatility change on January 28, 2020.Indeed, Fig 4 depicts a noticeable decline of the logreturns after the change point.The date of change is precisely located around when the severity of COVID-19 outbreak was rapidly escalating in China, while the first patient was concurrently reported in South Korea.This finding illuminates that South Korean economy suffered doubly because of COVID-19, as this incident predates the crash of the global stock market due to COVID-19 that occurred in early March, 2020.Moreover, this result further connotes that the global economy actually received a warning preceding the significant collapse in the following months.

Concluding remarks
This study demonstrated the merits of using residuals obtained from HSVR-GARCH models when formulating control charts, and additionally proposed a significantly improved variant of the OCC-based control chart.Unlike other hybridized GARCH models in the literature, our HSVR-GARCH model uses a likelihood-based loss function to overcome the problem that arises because of the unknownness of true volatility s 2 t .It was observed that the squared residuals computed from the HSVR-GARCH model significantly bolstered the overall detection ability of all control charts including the OCC-based one, and was proven to outperform those using residuals from the parametric GARCH model, especially in circumstances where the underlying model is nonlinear, sophisticated, or contaminated with innovational or additive outliers.The monitoring method combining the squared residuals of the HSVR-GARCH model and the OCC-based control chart consistently and promptly detected a structural change even when the observed time series are heavily contaminated and unstable.We also verified the validity of using bootstrap samples obtained from the HSVR-GARCH model when constructing the OCC-based chart, which can be crucial practical in training control charts when a amount of in-control time series is not available, which is a usual case in the financial time series analysis.Despite a number of improvements having been made to the OCC-based control chart, we can claim to facilitate the OCC-based control chart to be even more powerful by providing a higher dimensional training dataset to SVDD, through embedding the time series or its residual process to some adequate lower-dimensional structure-preserving feature space.Due to its importance, this task is worth further investigation and remains our future research project.

(ω, α 1 , α 2 ,
Fig 6 is observed to resemble the estimates of the GARCH(1,1) model, and the residuals obtained from the former model is

1 .
(13)mate ĝ from the training time series y 1 , ..., y m , then recursively compute ŝ2 t using(13).When parametric models are employed, we instead estimate the model parameters;2.Generate an iid sequence of standard normal random variables of length n, namely, � bt , t = 1, ..., n, b = 1, ..., B, and construct a sequence of bootstrap samples y b Based on �b2 t , fit SVDD and adjust its tuning parameters accordingly to obtain the control limit R 2 that sets the in-control ARL to the desired level.The validity of this method is discussed in the simulation experiments in Section 4. Also, this method is adopted when analyzing financial time series in Section 5, where only a small amount of training sample is available.Algorithm 2 SVDD b for OCC-based control charts N streams of in-control time series {z tj } t�1 (j = 1, . .., N), a kernel K 2 , s � 1, a desired level of in-control ARL c*, a tol- b Algorithm 3 OCC-based control chart for nonlinear GARCH models Input: A training set fy t g n t¼1 , an estimated model ĝ,

Table 7 . ARLs of control charts when the underlying model is log-GARCH(1,1) with the specified parameters, where the dataset is contaminated with a Z 1,2 -distrib- uted noise.
GARCH residuals indicates no significant autocorrelations up to lag 12 on both datasets at the nominal level of 0.05, with p-values of 0.0724 and 0.1118, respectively.Most notably, although the training set of Nasdaq seemingly contain a period of the increased volatility, the Ljung-Box test does not reject the null for the residuals of Nasdaq, which is a strong indication that HSVR-GARCH model can successfully capture the conditional volatility in these highly volatile real-world circumstances.

Table 9 . Summary statistics of the training time series, the result of the Ljung-Box test on HSVR-GARCH residu- als, and the location of the structural change regarding Nasdaq and KOSPI.
https://doi.org/10.1371/journal.pone.0299120.t009